Neural Nets





Kerry Back

Multi-layer perceptrons

  • A multi-layer perceptron (MLP) consists of “neurons” arranged in layers.
  • A neuron is a mathematical function. It takes inputs \(x_1, \ldots, x_n\), calculates \(y=f(x_1, \ldots, x_n)\), and passes \(y\) to the neurons in the next layer.
  • The inputs to the first layer are the predictors.
  • The inputs to each successive layer are the outputs of the prior layer.
  • The last layer is a single neuron that produces the output.

Illustration

  • inputs \(x_1, x_2, x_3, x_4\)
  • variables \(y_1, \ldots, y_5\) are calculated in hidden layer
  • output depends on \(y_1, \ldots, y_5\)
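As a sketch, the network illustrated above (4 inputs, a hidden layer of 5 neurons, 1 output) can be computed directly with NumPy. The weights and biases below are made-up numbers for illustration, not fitted values:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical parameters (randomly chosen, not fitted)
W1 = rng.normal(size=(5, 4))   # hidden layer: 5 neurons, each with 4 weights
b1 = rng.normal(size=5)        # one bias per hidden neuron
w2 = rng.normal(size=5)        # output neuron: weights on y_1, ..., y_5
b2 = rng.normal()              # output neuron bias

x = np.array([0.2, -0.1, 0.5, 1.0])   # inputs x_1, ..., x_4

y = np.maximum(0, b1 + W1 @ x)        # hidden layer (ReLU, defined below)
z = b2 + w2 @ y                       # output (linear)
print(y.shape, z)                     # y has shape (5,); z is a scalar
```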

Rectified linear units

  • The usual function for the neurons (except in the last layer) is

\[ y = \max(0,b+w_1x_1 + \cdots + w_nx_n)\]

  • The parameters \(b\) (called the bias) and \(w_1, \ldots, w_n\) (called the weights) differ across neurons.
  • This function is called a rectified linear unit (ReLU).
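A quick check of the formula, using a hypothetical bias and weights for a single neuron:

```python
import numpy as np

def relu(b, w, x):
    """Rectified linear unit: max(0, b + w_1 x_1 + ... + w_n x_n)."""
    return max(0.0, b + np.dot(w, x))

w = np.array([1.0, 2.0])            # hypothetical weights
x = np.array([0.5, 0.1])            # inputs

print(relu(-1.0, w, x))             # b + w·x = -0.3, so the neuron outputs 0.0
print(relu(0.5, w, x))              # b + w·x = 1.2, so the neuron outputs 1.2
```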

Analogy to neurons firing

  • If \(w_i>0\) for each \(i\), then \(y>0\) only when the inputs \(x_i\) are large enough that \(b+w_1x_1 + \cdots + w_nx_n>0\).
  • A neuron fires when it is sufficiently stimulated by signals from other neurons (in prior layer).

Output function

  • The output neuron doesn’t apply the option-like truncation \(\max(0,\cdot)\), so the output can be negative.
  • For regression problems, it is linear:

\[z = b+w_1y_1 + \cdots + w_ny_n\]

  • For classification, there is a linear function for each class and the prediction is the class with the largest value.
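A sketch of the classification rule: one linear function per class, predict the class whose function is largest. The per-class biases and weights below are made-up numbers:

```python
import numpy as np

# Hidden-layer outputs y_1, ..., y_5 (hypothetical values)
y = np.array([0.0, 1.2, 0.3, 0.0, 0.8])

# One bias and one weight vector per class (three classes, made-up values)
biases = np.array([0.1, -0.2, 0.0])
weights = np.array([
    [ 0.5, -0.1, 0.2, 0.0, 0.3],
    [-0.2,  0.4, 0.1, 0.5, 0.0],
    [ 0.1,  0.1, 0.1, 0.1, 0.1],
])

scores = biases + weights @ y     # one linear-function value per class
pred = int(np.argmax(scores))     # prediction = class with the largest value
print(scores, pred)
```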

Deep learning

  • Deep learning means a neural network with many layers. It is behind facial recognition, self-driving cars, …
  • Need a specialized library, probably TensorFlow (from Google) or PyTorch (from Facebook),
  • and probably need graphics processing units (GPUs) – i.e., run on video cards.

Example

  • Use roeq and mom12m for 2021-01 as before.
  • Predict rnk as before.
  • Two hidden layers with 4 nodes in the first and 2 in the second.

Define model

from sklearn.neural_network import MLPRegressor

X = data[["roeq", "mom12m"]]   # predictors
y = data["rnk"]                # target

model = MLPRegressor(
    hidden_layer_sizes=(4, 2),   # two hidden layers: 4 neurons, then 2
    random_state=0,
)
model.fit(X, y)

R-squared:

model.score(X,y)
0.05710430450701609
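For a regressor, model.score reports R-squared, i.e., \(1 - \text{SSE}/\text{SST}\). A small self-contained check of that formula with made-up targets and predictions:

```python
import numpy as np

y_true = np.array([1.0, 2.0, 3.0, 4.0])   # made-up targets
y_pred = np.array([1.1, 1.9, 3.2, 3.8])   # made-up predictions

sse = np.sum((y_true - y_pred) ** 2)              # sum of squared errors
sst = np.sum((y_true - y_true.mean()) ** 2)       # total sum of squares
r2 = 1 - sse / sst
print(r2)                                         # 0.98
```

sklearn.metrics.r2_score computes the same quantity.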


A prediction:

import numpy as np
x = np.array([0.1, 0.4]).reshape(1, 2)   # one observation: roeq = 0.1, mom12m = 0.4
model.predict(x)
array([0.50508646])


Save model:

from joblib import dump
dump(model, "net1.joblib")
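The saved file can be read back with joblib's load. A self-contained round-trip sketch, using a plain Python object and a temporary file in place of the fitted model (the same dump/load calls work for the model itself):

```python
import os
import tempfile
from joblib import dump, load

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "net1.joblib")
    dump({"weights": [1.0, 2.0]}, path)   # write object to disk
    restored = load(path)                 # read it back
    print(restored)                       # {'weights': [1.0, 2.0]}
```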